Abstract. This paper describes a collaborative educational data mining tool based on
association rule mining for the continuous improvement of e-learning courses. The tool
allows teachers of courses with similar profiles to share and score the discovered
information. It is intended for instructors who are not experts in data mining: its
internal operation is transparent to the user, so that the instructor can focus on
analyzing the results and deciding how to improve e-learning courses.


Introduction 

Nowadays, there is a variety of general data mining tools and frameworks. Some
examples of commercial mining tools are DBMiner [1], SPSS Clementine [2] and DB2
Intelligent Miner [3], and some examples of public domain mining tools are Weka [4],
RapidMiner [5] and Keel [6]. None of these tools is specifically designed for
pedagogical/educational purposes, and they are cumbersome for an educator to use because
they are normally designed more for power and flexibility than for simplicity. However,
there is also an increasing number of mining tools specifically oriented to educational
data, such as: Mining tool [7] for association and pattern mining; MultiStar [8] for
association and classification; EPRules [9] for association; KAON [10] for clustering and
text mining; Synergo/ColAT [11] for statistics and visualization; GISMO [12] for
visualization; Listen tool [13] for visualization and browsing; TADA-Ed [14] for
visualizing and mining; O3R [15] for sequential pattern mining; Sequential Mining tool
[16] for pattern mining; MINEL [17] for mining learning paths; Simulog [18] for looking
for unexpected behavioral patterns; and Moodle mining tool [19] for classification,
clustering and association rule mining. All these tools are intended to be used by a
single instructor or course administrator in order to discover useful knowledge from
their own courses. They therefore do not support collaborative usage, in which the
discovered information is shared with other instructors of similar courses (similar in
contents, subjects or educational type: elementary and primary education, adult
education, higher, tertiary and academic education, special education, etc.). With such
sharing, the information discovered locally by teachers could be joined and stored in a
common repository of knowledge available to all instructors for solving similar detected
problems.

In this paper, we describe an educational data mining tool based on association rule
mining and collaborative filtering for the continuous improvement of e-learning courses,
aimed at teachers who are not experts in data mining. The main objective is to build a
mining tool in which the discovered information can be shared and scored among
different instructors and experts in education.
Implementation of the collaborative data mining tool

We have developed a data mining tool with two subsystems: a client application and a
server application (Figure 1). The client application uses an association rule mining
algorithm to discover interesting relationships in students' usage data, in the form of
IF-THEN recommendation rules. The server application uses a collaborative recommender
system so that the rules previously obtained by instructors of similar courses can be
shared with, and scored by, other instructors and experts in education.



Figure 1. Collaborative data mining tool 

As Figure 1 shows, the system is based on a client-server architecture with N clients,
each of which applies an association rule mining algorithm locally to students' usage
data. The client application uses the Predictive Apriori algorithm [20], because it does
not require the user to specify parameters such as the minimum support threshold or
confidence values. Its only parameter is the number of rules to be discovered, which is
more intuitive for a teacher who is not an expert in data mining. The association rules
discovered by the client application must be evaluated to decide whether they are
relevant. To do so, the client application uses an evaluation measure [21] to classify
the rules as expected or unexpected, comparing them with the scored rules stored in a
collaborative rules repository maintained on the server side. The expected rules found
are then expressed in a more comprehensible format, as recommendations about possible
solutions to the problems detected in the course. The teacher sees each recommendation
and decides whether it is relevant enough to apply.

The server application, in turn, manages the rules repository using collaborative
filtering combined with knowledge-based techniques [21]. The information in the
knowledge base is stored in the form of tuples (rule-problem-recommendation-relevance),
which are classified according to a specific course profile. The course profile is
represented as a three-dimensional vector of characteristics of the course: Topic (the
area of knowledge, e.g. Computer Science or Biology); Level (the level of the course,
e.g. University, High School, Elementary or Special Education); and Difficulty (the
difficulty of the course, e.g. Low or High). Based on these similarities between
courses, the tuples are made available to other teachers, who can assess them in terms
of applicability and relevance. A group of experts in online education from the
University of Cordoba, Spain, proposes the first tuples of the rule repository and also
votes on the tuples proposed by other experts. Teachers, on the other hand, can discover
new tuples (in the client application), but these must be validated by the experts (in
the server application) before being inserted into the rule repository.
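
The idea of asking the teacher only for the number of rules can be illustrated with a
minimal Python sketch. It is not an implementation of Predictive Apriori [20]: as a
simplifying assumption, candidate rules are ranked by confidence and then by support,
whereas Predictive Apriori ranks them by an estimate of predictive accuracy. The names
top_rules and format_rule, the record layout and the sample records are hypothetical;
the attributes e_time and e_score are borrowed from the rule shown in Figure 4.

```python
from itertools import combinations

# Hypothetical, already discretized usage records (one per student and exercise).
records = [
    {"e_time": "HIGH", "e_score": "LOW"},
    {"e_time": "HIGH", "e_score": "LOW"},
    {"e_time": "HIGH", "e_score": "LOW"},
    {"e_time": "LOW", "e_score": "HIGH"},
    {"e_time": "LOW", "e_score": "LOW"},
]


def top_rules(records, n_rules, max_antecedent=1):
    """Return the n_rules best rules as (confidence, support, antecedent, consequent).

    Simplification: rules are ranked by confidence, then support, not by the
    predictive accuracy estimate used by Predictive Apriori.
    """
    transactions = [frozenset(r.items()) for r in records]
    items = sorted(set().union(*transactions))

    def count(itemset):
        # Number of records that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    rules = []
    for size in range(1, max_antecedent + 1):
        for antecedent in combinations(items, size):
            ante_count = count(frozenset(antecedent))
            if ante_count == 0:
                continue
            used = {attr for attr, _ in antecedent}
            for consequent in items:
                if consequent[0] in used:
                    continue  # an attribute cannot be on both sides of a rule
                both = count(frozenset(antecedent) | {consequent})
                if both:
                    rules.append((both / ante_count, both / len(records),
                                  antecedent, consequent))
    # Best rules first; the last two keys only make the order deterministic.
    rules.sort(key=lambda r: (-r[0], -r[1], r[2], r[3]))
    return rules[:n_rules]


def format_rule(rule):
    """Render a rule as 'attr = VALUE => attr = VALUE' (the AND separator is assumed)."""
    _, _, antecedent, consequent = rule
    left = " AND ".join(f"{a} = {v}" for a, v in antecedent)
    return f"{left} => {consequent[0]} = {consequent[1]}"


for rule in top_rules(records, n_rules=2):
    print(format_rule(rule), f"(confidence {rule[0]:.2f}, support {rule[1]:.2f})")
```

With these sample records the top-ranked rule is e_time = HIGH => e_score = LOW. The
teacher-facing interface reduces to the single argument n_rules, which is the property
the tool relies on.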

1.1 Client application 

As mentioned above, the client application is used by instructors to find association
rules. Its main feature is its specialization in educational environments. Before the
mining algorithm is applied, the data have to be pre-processed to adapt them to our
specific data model. First, the teacher has to select the source of the data to be
mined. Two input formats are available: 1) the Moodle relational database, for teachers
who work with Moodle together with the INDESAHC authoring tool [22], so that all our
attributes are used directly; or 2) a Weka [4] ARFF text file, for teachers who use
other LMSs and, therefore, other attributes. The teacher can also restrict the search
space by means of a few parameters related to the depth of the analysis. In particular,
the teacher must select the level of granularity at which to carry out the analysis:
course, unit, lesson or a specific table of the database such as course-unit,
course-lesson, course-exercise, course-forum, unit-exercise, unit-lesson or
lesson-exercise, among others.

The rules repository (see Figure 2) is the knowledge database upon which the analysis of
the discovered rules is based. Before running the algorithm, the teacher downloads the
current knowledge database from the server, according to his or her course profile.
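
A minimal Python sketch of how the downloaded repository might be represented and used
is given here under several assumptions. The class and function names are hypothetical,
the profile is matched by exact equality (the tool itself works with similar course
profiles), and exact string comparison of rules stands in for the evaluation measure of
[21], which is not reproduced. The first tuple is taken from Figure 4; the relevance
values, the second tuple and the second discovered rule are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CourseProfile:
    topic: str       # area of knowledge
    level: str       # level of the course
    difficulty: str  # difficulty of the course


@dataclass(frozen=True)
class RepositoryTuple:
    rule: str
    problem: str
    recommendation: str
    relevance: float  # hypothetical numeric score
    profile: CourseProfile


def download_repository(server_tuples, profile):
    """Keep the tuples whose profile equals the teacher's (assumed exact match)."""
    return [t for t in server_tuples if t.profile == profile]


def classify(discovered_rules, repository):
    """Split discovered rules into expected (already known) and unexpected ones."""
    known = {t.rule: t for t in repository}
    expected = [known[r] for r in discovered_rules if r in known]
    unexpected = [r for r in discovered_rules if r not in known]
    return expected, unexpected


cs_profile = CourseProfile("COMPUTER SCIENCE", "UNIVERSITY", "BASIC")
server_tuples = [
    RepositoryTuple(
        rule="e_time = HIGH => e_score = LOW",
        problem="The exercise wording could be incorrect or ambiguous.",
        recommendation="Modify the exercise wording",
        relevance=0.8,
        profile=cs_profile,
    ),
    RepositoryTuple(  # invented tuple for a different course profile
        rule="e_time = HIGH => e_score = LOW",
        problem="The exercise hyperlink could be broken.",
        recommendation="Verify the exercise hyperlink",
        relevance=0.5,
        profile=CourseProfile("BIOLOGY", "HIGH SCHOOL", "LOW"),
    ),
]

local_repository = download_repository(server_tuples, cs_profile)
discovered = ["e_time = HIGH => e_score = LOW", "u_time = LOW => u_score = HIGH"]
expected, unexpected = classify(discovered, local_repository)

for t in expected:
    print("Expected:", t.rule, "->", t.recommendation)
for r in unexpected:
    print("Unexpected:", r)
```

In this sketch an expected rule immediately yields the stored problem and
recommendation, while an unexpected rule is only reported; in the tool, such rules would
have to be validated by the experts before entering the repository.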



Figure 2. Rules repository panel 
Finally, after downloading the rule repository and configuring the application parameters
(or keeping their default values), the teacher executes the association rule algorithm.
The client application then shows the results in a table (see Figure 3) with the following
fields: rule (the discovered IF-THEN rule), problem (detected by the rule), recommendation
(about how to solve the problem), score (given to the rule by experts and other
instructors) and apply button (to apply the recommendation to his or her course).



Figure 3. Results panel 

We distinguish between two types of recommendations: 1) Active, if it implies a
direct modification of the course content or structure; or 2) Passive, if it detects a more
general problem in the course or unit and advises the teacher to consult more specific
recommendations related to these didactic resources. Active recommendations can
involve: modifications to the wording of the questions (see Figure 3) or to the practical
exercises/tasks assigned to the students; changes to previously assigned parameters such
as course duration or the level of lesson difficulty; or the elimination of a resource such
as a forum or a chat room.
1.2 Server application

The server application is used by experts and instructors. The experts in education insert
the tuples and vote for them explicitly by indicating degrees of preference (see
Figure 4). The teachers vote implicitly when they press the “Apply” button. This
side-steps one of the main problems of collaborative filtering systems, namely how to
encourage users to vote or evaluate: if teachers apply one of the recommendations to
their course, they are implicitly voting for that specific tuple.


Tuple evaluation

Rule: e_time = HIGH => e_score = LOW
Rule profile: COMPUTER SCIENCE, UNIVERSITY, BASIC
Problem detected: The exercise wording could be incorrect or ambiguous. The exercise
hyperlink could be broken.
Recommendations:
1. Verify the exercise hyperlink
2. Modify the exercise wording
3. Eliminate the exercise

Expert evaluation (each criterion is rated Very Low, Low, Normal, High or Very High)
1. The rule comprehensibility is
2. The rule suitability is
3. The adjusting of the rule to the selected profile in the knowledge database is

Expert decision (each criterion is rated Very Low, Low, Normal, High or Very High)
1. My recommendation about to add this rule to the repository is
2. My confidence in this decision is
3. My experience as expert in this rule profile is

Button: Vote for the rule


Figure 4. Vote panel 

The server application is a web-based application for managing the knowledge database
or tuple repository (see Figure 5). To give easy access to all the editing options of
the repository, a general course profile was created, which is the profile used by the
experts in the educational domain. These experts have permission to insert new tuples
into the rule repository and to vote explicitly for existing ones (see Figure 4). To
allow the exchange of information (tuples) between client and server, we have developed
a web service for downloading/uploading the repository. Each time a client application
updates its repository, all the tuples in the repository are reordered.
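
The combination of explicit expert votes, implicit teacher votes and reordering can be
sketched in Python as follows. The scoring formula is purely an assumption made for
illustration (the mean of the expert votes on a 1 to 5 scale plus a fixed bonus per
application); the actual scoring used by the tool is described in [21] and is not
reproduced here. All names, including IMPLICIT_WEIGHT, are hypothetical.

```python
from dataclasses import dataclass, field

# The five-level scale of the vote panel mapped to numbers (assumed mapping).
SCALE = {"Very Low": 1, "Low": 2, "Normal": 3, "High": 4, "Very High": 5}
IMPLICIT_WEIGHT = 0.25  # hypothetical bonus for each press of "Apply"


@dataclass
class ScoredTuple:
    rule: str
    expert_votes: list = field(default_factory=list)
    times_applied: int = 0

    def vote(self, label):
        """Explicit vote by an expert, given as a label of the five-level scale."""
        self.expert_votes.append(SCALE[label])

    def apply(self):
        """Implicit vote: a teacher applied the recommendation to a course."""
        self.times_applied += 1

    def score(self):
        # Assumed formula: mean expert vote plus a bonus per application.
        mean = sum(self.expert_votes) / len(self.expert_votes) if self.expert_votes else 0.0
        return mean + IMPLICIT_WEIGHT * self.times_applied


def reorder(tuples):
    """Return the tuples sorted from highest to lowest score."""
    return sorted(tuples, key=lambda t: t.score(), reverse=True)


a = ScoredTuple("rule A")
a.vote("High")
a.vote("Very High")
b = ScoredTuple("rule B")
b.vote("Normal")

order_before = [t.rule for t in reorder([a, b])]
for _ in range(8):  # eight teachers apply the recommendation of rule B
    b.apply()
order_after = [t.rule for t in reorder([a, b])]
print(order_before, order_after)
```

The point of the sketch is only that pressing “Apply” changes the ranking without
requiring teachers to fill in a vote form; any monotone combination of the two kinds of
votes would show the same effect.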

Finally, we must mention that an evaluation of this collaborative data mining tool can be 
found in [21]. 
Figure 5. Server application interface


2 Conclusions 

In this paper we have presented a data mining tool that uses association rule mining and
collaborative filtering to make recommendations to instructors about how to improve
e-learning courses. The tool enables the discovered rules to be shared and scored by
other teachers of similar courses. So far, the mining tool has only been used by a
group of instructors and experts involved in the development of the tool itself. In the
future, we therefore want to test the usability of the tool with several groups of
external instructors and experts.